Classifier Assignment By Corpus-Based Approach
نویسندگان
چکیده
This paper presents an algorithm for selecting an appropriate classifier word for a noun. In Thai language, it frequently happens that there is fluctuation in the choice of classifier for a given concrete noun, both from the point of view of the whole speech community and individual speakers. Basically, there is no exact rule for classifier selection. As far as we can do in the rule~based approach is to give a default rule to pick up a corresponding classifier of each noun. Registration of classifier for each noun is limited to the type of unit classifier because other types ,are open due to the meaning of representation. We propose a corpus-based method (Biber,1993; Nagao,1993; Smadja,1993) which generates Noun Classifier Associations (NCA) to overcome the problems in classifier assignment and semantic construction of noun phrase. The NCA is created statistically from a large corpus and recomposed under concept hierarchy constraints and frequency of occurrences.
منابع مشابه
cm p - lg / 9 41 10 27 7 O ct 1 99 5 CLASSIFIER ASSIGNMENT BY CORPUS - BASED APPROACH
This paper presents an algorithm for selecting an appropriate classifier word for a noun. In Thai language, it frequently happens that there is fluctuation in the choice of classifier for a given concrete noun, both from the point of view of the whole speech community and individual speakers. Basically, there is no exact rule for classifier selection. As far as we can do in the rule-based appro...
متن کاملروشی جدید جهت استخراج موجودیتهای اسمی در عربی کلاسیک
In Natural Language Processing (NLP) studies, developing resources and tools makes a contribution to extension and effectiveness of researches in each language. In recent years, Arabic Named Entity Recognition (ANER) has been considered by NLP researchers due to a significant impact on improving other NLP tasks such as Machine translation, Information retrieval, question answering, query result...
متن کاملImprovement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination
Chemical Named Entity Recognition (NER) is the basic step for consequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, extraction of the names of the molecules and their properties. Improvement in the performance of such systems may affects the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...
متن کاملLearning to Solve QBF
We present a novel approach to solving Quantified Boolean Formulas (QBF) that combines a search-based QBF solver with machine learning techniques. We show how classification methods can be used to predict run-times and to choose optimal heuristics both within a portfolio-based, and within a dynamic, online approach. In the dynamic method variables are set to a truth value according to a scheme ...
متن کاملSemi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk
This study explores a semi-supervised classification approach using random forest as a base classifier to classify the low-back disorders (LBDs) risk associated with the industrial jobs. Semi-supervised classification approach uses unlabeled data together with the small number of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...
متن کامل